Lazy Credal Classifier and how to compare credal classifiers
Author
Abstract
This poster makes two main contributions: (a) a lazy (or local) version of the naive credal classifier (NCC), which we call the lazy naive credal classifier (LNCC); (b) two metrics to compare credal classifiers. NCC [1] extends naive Bayes (NB) to imprecise probabilities by modeling prior ignorance via the Imprecise Dirichlet Model; the classification is eventually issued by returning the set of non-dominated classes. NCC therefore returns a set of classes when faced with instances whose classification would be prior-dependent for NB. Extensive experiments have shown that NCC is more reliable than NB. Yet NCC has two drawbacks: (i) the naive assumption (statistical independence of the features given the class) might be too simplistic, and (ii) in some cases NCC becomes too indeterminate. We address both issues by proposing LNCC. Working locally addresses point (i), because it reduces the chance of encountering strong dependencies. In addition, LNCC should also improve the determinacy of NCC, thus addressing (ii): by working locally, it selects the part of the learning set that is most informative about the instance to be classified. How do we select the number of instances used to train the local classifier? We keep including instances in the local learning set until NCC issues a determinate classification on the instance to classify (note that this clearly favors removing indeterminacy). The rationale behind this criterion is that we select a local learning set that is informative enough to draw a strong conclusion, such as a determinate classification. We investigate the effect of these choices through extensive experiments comparing LNCC with NCC. To compare LNCC and NCC, we propose (a) an indicator borrowed from multi-label classification and (b) a non-parametric rank test. To our knowledge, this is the first attempt to empirically compare credal classifiers.
Results on 36 data sets show that, according to both tests, LNCC clearly outperforms NCC, as it significantly reduces indeterminacy without worsening, and often improving, the overall accuracy.

Acknowledgments: work partially supported by the Swiss NSF grant n. 200021-118071/1.
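The instance-selection loop described in the abstract (grow the local learning set, nearest neighbors first, until NCC answers with a single class) can be sketched as follows. This is a minimal illustration, not the authors' implementation: `ncc_classify`, the initial size `k0`, the growth `step`, and the Euclidean distance are all assumptions introduced here.

```python
import math

def euclidean(a, b):
    """Plain Euclidean distance between two numeric feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def lncc_classify(train, query, ncc_classify, k0=25, step=25):
    """Hypothetical LNCC loop: train is a list of (features, label) pairs,
    ncc_classify(local_set, query) stands in for an NCC implementation and
    must return the set of non-dominated classes.  The local learning set
    grows until the answer is determinate (a single class) or data run out."""
    ranked = sorted(train, key=lambda ex: euclidean(ex[0], query))
    k = k0
    while True:
        classes = ncc_classify(ranked[:k], query)
        if len(classes) == 1 or k >= len(ranked):
            return classes
        k += step  # still indeterminate: include more neighbors
```

Note how the stopping rule itself favors determinacy: the loop only halts early on a single-class answer, which is exactly the bias the abstract acknowledges.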
Similar articles
Likelihood-Based Naive Credal Classifier
Bayesian classifiers learn a joint distribution P(C, F) and assign to the instance f̃ the most probable class label, argmax_{c′ ∈ C} P(c′, f̃). This defines a classifier, i.e., a map (F_1 × … × F_m) → C. Credal classifiers instead learn a joint credal set 𝒫(C, F) and return the set of optimal classes (e.g., according to maximality): {c′ ∈ C | ∄ c″ ∈ C, ∀P ∈ 𝒫 : P(c″ | f̃) > P(c′ | f̃)}. This defines a credal classifier, i.e., a map (F_1 × … × F_m) → 2^C. May return mo...
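The maximality criterion above can be made concrete for a credal set represented by finitely many posterior distributions: a class is dropped only if some other class has strictly higher posterior under every distribution in the set. The following sketch is an illustration of that rule; the dict-of-posteriors representation is an assumption made here for simplicity.

```python
def maximal_classes(posteriors):
    """Non-dominated classes under maximality.
    posteriors: list of dicts, one per distribution P in the credal set,
    each mapping a class c to its posterior P(c | f).
    A class c is dominated iff some c2 satisfies P(c2|f) > P(c|f) for all P."""
    classes = posteriors[0].keys()
    return {
        c for c in classes
        if not any(
            all(p[c2] > p[c] for p in posteriors)
            for c2 in classes if c2 != c
        )
    }
```

With a single distribution this reduces to the usual argmax; with several, classes whose ranking is distribution-dependent all survive, which is exactly how a credal classifier ends up returning a set.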
Credal Classification based on AODE and compression coefficients
Bayesian model averaging (BMA) is a common approach to average over alternative models; yet, it usually gets excessively concentrated around the single most probable model, therefore achieving only sub-optimal classification performance. The compression-based approach (Boullé, 2007) overcomes this problem; it averages over the different models by applying a logarithmic smoothing over the models...
Bayesian Networks with Imprecise Probabilities: Theory and Application to Classification
Bayesian networks are powerful probabilistic graphical models for modelling uncertainty. Among others, classification represents an important application: some of the most used classifiers are based on Bayesian networks. Bayesian networks are precise models: exact numeric values should be provided for quantification. This requirement is sometimes too narrow. Sets instead of single distributions ...
Credal Classification for Mining Environmental Data
Classifiers that aim at doing reliable predictions should rely on carefully elicited prior knowledge. Often this is not available so they should be able to start learning from data in condition of prior ignorance. This paper shows empirically, on an agricultural data set, that established methods of classification do not always adhere to this principle. Common ways to represent prior ignorance ...
Tree-Based Credal Networks for Classification
Bayesian networks are models for uncertain reasoning which are achieving a growing importance also for the data mining task of classification. Credal networks extend Bayesian nets to sets of distributions, or credal sets. This paper extends a state-of-the-art Bayesian net for classification, called tree-augmented naive Bayes classifier, to credal sets originated from probability intervals. This...
Publication date: 2009